Dereferenceable Uniform Resource Identifier

A dereferenceable Uniform Resource Identifier or dereferenceable URI is a resource retrieval mechanism that uses any of the internet protocols (e.g. HTTP) to obtain a copy or representation of the resource it identifies.

In the context of traditional HTML web pages, this is the normal and obvious way of working: a URI refers to the page, and when it is requested the web server returns a copy of that page. In other, non-dereferenceable contexts, such as XML Schema, the namespace identifier is still a URI, but it is simply an identifier (i.e. a namespace name). There is no intention that it can or should be dereferenced. There is even a separate attribute, schemaLocation, which may contain a dereferenceable URI that does point to a copy of the schema document.

In the case of Linked Data, the representation takes the form of a document (typically HTML or XML) that describes the resource that the URI identifies. In either case, the mechanism makes it possible for a user (or software agent) to "follow your nose" to find out more information related to the identified resource.

Background

In computing, identifiers are used to distinguish things and to facilitate data exchange. For example, two U.S. citizens with the same name would have different Social Security numbers (SSNs). In a fully distributed system, such as the World Wide Web, a URI is used to identify a thing globally. Unfortunately, because the Web's architecture and design decisions were made with HTTP in mind, URIs often identify web pages instead of the underlying things. To remove this confusion, URIs that identify things often include a hash (see the following section). The following example shows the difference between a URL of a person (which usually means that person's homepage) and a URI of a person:

Because of the nature of a URI, it can be dereferenced to get information about the thing it represents, hence the term dereferenceable URI. An SSN and a person's name are not dereferenceable because, even though one could search the web for these strings, there is no guarantee that the information exists or that it is unambiguous. In other words, there is no canonical way of dereferencing those identifiers. URIs, on the other hand, can be dereferenced using a standardized protocol such as HTTP.
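
As a rough illustration of dereferencing over HTTP, the following Python sketch prepares a GET request for a URI and states which representation format it would prefer. The function names and the default Accept value are assumptions made for this example, not part of any standard API, and whether a given server honours the Accept header is up to that server.

```python
from urllib.request import Request, urlopen


def build_dereference_request(uri, accept="text/html"):
    """Prepare (but do not send) an HTTP GET request for the given URI."""
    # The Accept header names the representation format the client prefers.
    return Request(uri, headers={"Accept": accept})


def dereference(uri, accept="text/html", timeout=10):
    """Send the request; return (final URL, content type, body bytes)."""
    request = build_dereference_request(uri, accept)
    # urlopen follows HTTP redirects by default, so the final URL may
    # differ from the URI that was originally requested.
    with urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type")
        return response.geturl(), content_type, response.read()
```

Calling dereference("http://dbpedia.org/resource/Berlin") would perform an actual network request; what comes back depends on the server, so no output is shown here.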

Dereferenceable URIs are based on the well-established theory and practice of "data access by reference", a data access and manipulation mechanism that is used extensively in general computer programming (e.g., C/C++ pointers) and in database call-level interfaces (e.g., ODBC and JDBC), among others. The term dereferencing describes the act of obtaining a representation of a description of an entity via its URI.

In the Semantic Web realm, dereferenceable URIs provide the critical fabric that drives the Giant Global Graph of interconnected data popularly referred to as Linked Data, another term coined by Tim Berners-Lee in his Linked Data Design Note[1] and developed further in other articles such as "Cool URIs for the Semantic Web" by Sauermann and Cyganiak.[2]

Eventually everything will have its own dereferenceable URI,[3] but things that already have URIs and are described in an interoperable way at this moment are:

Formats

Dereferenceable URIs are constructed using one of two forms: hash or slash. The critical point about either form is that it relies on the existing Web architecture to preserve the implicit identity (or pointer) function.

Hash URI example

Entity Berlin: http://linkeddata.openlinksw.com/about/Berlin#this

Slash URI example

Entity Berlin: http://dbpedia.org/resource/Berlin
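
One practical difference between the two forms can be sketched in Python. An HTTP client removes the fragment (the part after the hash) before contacting the server, so a hash URI is retrieved through the URI of its containing document, whereas a slash URI is requested as written. The helper names below are hypothetical and exist only for this illustration.

```python
from urllib.parse import urldefrag


def uri_form(uri):
    """Classify a URI as 'hash' if it carries a fragment, else 'slash'."""
    return "hash" if urldefrag(uri).fragment else "slash"


def document_url(uri):
    """Return the URL a client would actually request from the server."""
    # The fragment is handled on the client side and is never sent over HTTP.
    return urldefrag(uri).url


# The two example URIs for the entity Berlin given above.
hash_uri = "http://linkeddata.openlinksw.com/about/Berlin#this"
slash_uri = "http://dbpedia.org/resource/Berlin"

for uri in (hash_uri, slash_uri):
    print(uri_form(uri), "->", document_url(uri))
```

For the slash form, the server has to distinguish the thing from the document describing it by other means; "Cool URIs for the Semantic Web"[2] describes a 303 redirect for this purpose.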

Summary

In summary we can establish the following facts:

References

  1. Berners-Lee, Tim (2006), Design Note: Linked Data, W3C, http://www.w3.org/DesignIssues/LinkedData.html, retrieved 2008-07-21
  2. Sauermann, Leo; Cyganiak, Richard (2008), Cool URIs for the Semantic Web, W3C, http://www.w3.org/TR/cooluris/, retrieved 2008-07-21
  3. Berners-Lee, Tim (2006), Give yourself a URI, DIG, http://dig.csail.mit.edu/breadcrumbs/node/71, retrieved 2009-01-14

Further reading